Concept Formation Using Graph Grammars

Authors

  • Istvan Jonyer
  • Lawrence B. Holder
  • Diane J. Cook
Abstract

Recognizing the expressive power of graph representations and the ability of certain graph grammars to generalize, we attempt to use graph grammar learning for concept formation. In this paper we describe our initial progress toward that goal, focusing on how certain graph grammars can be learned from examples. We also establish grounds for using graph grammars in machine learning tasks. Several examples are presented to highlight the validity of the approach.

Introduction

Graphs are important data structures because of their ability to represent any kind of data. Algorithms that generate theories of graphs are therefore of great importance in data mining and machine learning. In this paper we describe an algorithm that learns graph grammars: sets of grammar production rules that describe a graph-based database. The goal of our research is to adapt graph grammar learning for concept formation, in the hope that the expressive power of graphs, combined with the ability of graph grammars to generalize, will prove to be a powerful learning paradigm. This paper presents initial progress toward that goal and sets the stage for subsequent work.

Only a few algorithms exist for the inference of graph grammars. An enumerative method for inferring a limited class of context-sensitive graph grammars is due to Bartsch-Spörl (1983). Other algorithms use a merging technique for hyperedge replacement grammars (Jeltsch and Kreowski 1991) and regular tree grammars (Carrasco et al. 1998). Our approach is based on a method for discovering frequent substructures in graphs (Cook and Holder 2000).

In the following section we discuss different types of graph grammars and argue for their usefulness in machine learning. We then describe the graph grammars we set out to learn and define some terminology. Next, we present a set of examples to provide visual insight into graph grammars before describing the algorithm, followed by a working example on an artificial domain that illustrates the algorithm in more detail. We then discuss the types of grammars the algorithm can learn and point out some of its limitations. We conclude with an overall assessment of the approach and directions for future work.

Graph Grammars and Machine Learning

When learning a grammar, one first has to decide its intended use. Grammars have two applications: parsing a language and generating one. Parser grammars are optimized for fast parsing at the cost of some accuracy, which results in grammars that over-accept: they will accept "sentences" that are not in the language. Generator grammars likewise trade accuracy for speed; as a consequence, they cannot generate the entire language. A grammar that both generates and parses the same language exactly is very hard to design, and is usually too big and too slow to be practical.

In this paper we address the problem of inferring graph grammars from positive examples. Our purpose is to use grammar learning as an approach to data mining, although other uses can be found as well. The generated graph grammar serves as our theory of the input domain. Machine learning algorithms generally attempt to learn theories that generalize to a certain degree, so that new, unseen data can be accurately categorized. Translated into grammar terms, we would like to learn a grammar that accepts more than just the training language. We therefore aim to learn parser grammars, which can express more general concepts than the sum of the positive examples.
Grammars can be context-sensitive or context-free. Context-sensitive graph grammars are more expressive and allow the specification of graph transformations, since both sides of a production can be arbitrary graphs. As a starting point, however, we aim to learn context-free grammars that have single-vertex non-terminals on the left-hand side of production rules. This is not a serious limitation, especially since the vast majority of graph grammar parsers can only deal with exactly such grammars (Rekers and Schürr 1995).

Why learn graph grammars rather than textual ones? Textual grammars are also useful, but they are limited to databases that can be represented as a sequence, such as a DNA sequence. Most databases, however, have a non-sequential structure, and many have significant structural components. Relational databases are good examples, and even more complex information can be represented using graphs; examples include circuit diagrams and the World Wide Web. Graph grammars can still represent the simpler feature-vector databases as well as sequential databases (like the DNA sequences mentioned above). Graphs are among the most expressive representations; an algorithm that can learn a theory of a graph would therefore be widely useful. We emphasize that our purpose in learning graph grammars is not to provide an efficient graph parsing algorithm. Graph parsing will be necessary for classifying unseen example graphs, and while the parsing efficiency of the learned grammar will be a concern there, it is not a primary goal of the generalization step.

Graph Grammars

Before getting into the details of inferring graph grammars, we first give a general overview of the type of grammar we seek to learn. In this paper, and in our research in general, we are concerned with graph grammars of the set-theoretic, or expression, approach (Nagl 1987). In this approach a graph is a pair of sets G = 〈V, E〉, where V is the set of vertices (or nodes) and E ⊆ V × V is the set of edges. Production rules are of the form S → P, where S and P are graphs. When such a rule is applied to a graph, an isomorphic copy of S is removed from the graph along with all its incident edges, and is replaced with a copy of P, together with edges connecting it to the graph. The new edges are given new labels to reflect their connection to the substructure instance.

A special case of the set-theoretic approach is the node-label controlled grammar, in which S consists of a single labeled node (Engelfriet and Rozenberg 1991). This is the type of grammar we focus on. In our case, S is always a non-terminal, while P can be any graph and can contain both terminals and non-terminals. Since we are going to learn grammars to be used for parsing, the embedding function is irrelevant: external edges that are incident on a vertex in the subgraph being replaced (P) are always reconnected to the single vertex S. (A minimal concrete encoding of these definitions is sketched below.)

Recursive productions are of the form S → P S. The non-terminal S appears on both sides of the production, and P is linked to S via a single edge. The complexity of the algorithm is exponential in the number of edges considered between recursive instances, so for now we limit this number to one. If the grammar is used for graph generation, such a rule generates an infinitely long sequence of the graph P; if the language is to be finite, a stopping alternative production is required.
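To make the preceding definitions concrete, the following is a minimal sketch of how a labeled graph G = 〈V, E〉 and a node-label controlled production might be encoded. The class and field names are ours, chosen for illustration; they are not SubdueGL's actual data structures.

    from dataclasses import dataclass

    @dataclass
    class Graph:
        labels: dict   # vertex id -> label, e.g. {0: 'a', 1: 'b'}
        edges: set     # subset of V x V, as a set of (u, v) pairs

    @dataclass
    class Production:
        lhs: str            # single non-terminal label on the left-hand side
        alternatives: list  # one or more right-hand-side graphs

    # By convention in this sketch, labels beginning with 'S' denote
    # non-terminals; all other labels are terminals.

    # The recursive production S -> P S, where P is the chain a-b:
    # the right-hand side holds P's terminals plus the non-terminal S,
    # linked to P by a single edge (the limit discussed above).
    recursive_rhs = Graph(labels={0: 'a', 1: 'b', 2: 'S'},
                          edges={(0, 1), (1, 2)})
    rule = Production(lhs='S', alternatives=[recursive_rhs])

The alternatives list allows a rule to carry more than one right-hand side, which the stopping and alternative productions discussed next will require.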
One such stopping alternative is the production S → P S | ∅, which reads "replace S with P S or with nothing." For our purposes, however, we use the production S → P S | P. The rule S → P S | ∅, when used for parsing, would imply that nothing can be replaced with S, introducing an arbitrary number of S's. At the same time, it could not parse a chain of P's of finite length, because it would have no starting point: the subgraph P S does not exist in the input graph. Remember that the stopping alternative of a generator rule is the starting point of the corresponding parser rule. When parsing a graph, we start from the complete graph and work toward a single non-terminal. This is done by removing subgraphs that match the right-hand side of a production and inserting the non-terminal on the left-hand side: in our example, replacing P S with S and, finally, P with S. An example of a recursive production is shown in Figure 1c (production S1).

Alternative productions are of the form S → P1 | P2 | ... | Pn. The non-terminal graph S can be thought of as a variable with possible values P1, P2, ..., Pn. We will sometimes refer to such an S as a variable non-terminal, or simply a variable. If S is a single vertex and the Pi are also single vertices, then S is synonymous with a regular non-graph variable: its values are the vertex labels, which can be alphanumeric values such as numbers (discrete or continuous) or string descriptions. An example of a variable is shown in Figure 1c, where S2 has the possible values 'c' and 'd'.

Examples

Before presenting the algorithm, a couple of examples are given here to further clarify what we are trying to accomplish. The first example is suggested by the authors of Sequitur (Nevill-Manning and Witten 1997). Sequitur infers compositional hierarchies from strings: it detects repetition and factors it out by forming rules in a grammar. The example string to be analyzed is "abcabdabcabd". The grammar generated by Sequitur is shown in Figure 1a. Our algorithm, called SubdueGL, learns graph grammars, so the input has to be in graph format; the sequential data was therefore represented as a series of vertices labeled according to the example string and connected by single edges, as shown in Figure 1b. The graph grammar learned by SubdueGL is shown in Figure 1c, and its sequential interpretation in Figure 1d.

Figure 1. First example: a) grammar learned by Sequitur: S → 1 1, 1 → 2 c 2 d, 2 → a b (the digits denote Sequitur's non-terminals); b) input graph to SubdueGL: the chain a-b-c-a-b-d-a-b-c-a-b-d; c) graph grammar learned by SubdueGL; d) equivalent string grammar: S1 → a b S2 S1 | a b S2, S2 → c | d.

The first obvious difference is that SubdueGL is able to learn recursive grammars. SubdueGL's version of the grammar is also more general, since it parses a string of any length, and the letters 'c' and 'd' do not have to occur in the same order. This example will be referenced again in the next section, where we describe the algorithm.

The next example is a variation of the previous one, with an 'x' slightly breaking the regularity of the pattern: "abcabdxabcabd". The grammar learned by Sequitur, shown in Figure 2a, is very similar to the previous one in Figure 1a. SubdueGL, however, adds an extra production to its grammar, resulting in the grammar shown in Figure 2b.

Figure 2. Grammars learned from "abcabdxabcabd": a) by Sequitur: S → 1 x 1, 1 → 2 c 2 d, 2 → a b; b) by SubdueGL: S1 → a b S2 S1 | a b S2, S2 → c | d, S3 → S1 x S1.
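To make the parsing direction concrete, the following sketch reduces the chain graph of Figure 1b using the grammar of Figure 1d. Because the example graph is a simple chain, parsing can be illustrated as rewriting the sequence of vertex labels; a general graph parser would instead search for subgraph isomorphisms. The functions and the right-to-left reduction strategy below are our own illustration, not SubdueGL's parser.

    def parse_chain(labels):
        """Reduce a chain of vertex labels to a single non-terminal using
        the grammar of Figure 1d:
            S1 -> a b S2 S1 | a b S2
            S2 -> c | d
        A simplified illustration: parsing runs opposite to generation,
        replacing matches of a right-hand side by the left-hand
        non-terminal."""
        # The variable non-terminal S2: both 'c' and 'd' reduce to S2.
        seq = ['S2' if x in ('c', 'd') else x for x in labels]
        # The stopping alternative a-b-S2 -> S1 is the parser's starting
        # point; apply it once, at the right end of the chain.
        if seq[-3:] == ['a', 'b', 'S2']:
            seq[-3:] = ['S1']
        # Then repeatedly apply the recursive alternative a-b-S2-S1 -> S1.
        while seq[-4:] == ['a', 'b', 'S2', 'S1']:
            seq[-4:] = ['S1']
        return seq

    def parse_with_x(string):
        """Parse with the Figure 2b grammar, which adds S3 -> S1 x S1."""
        halves = string.split('x')
        if len(halves) == 2 and all(parse_chain(h) == ['S1'] for h in halves):
            return ['S3']
        return list(string)  # no parse

    print(parse_chain("abcabdabcabd"))   # ['S1']: reduces to the start symbol
    print(parse_chain("abdabcabc"))      # ['S1']: 'c' and 'd' in any order
    print(parse_chain("abcabx"))         # unreduced chain: no parse
    print(parse_with_x("abcabdxabcabd")) # ['S3']: second example's grammar

Note that any chain of 'ab' pairs each followed by 'c' or 'd' reduces to S1, regardless of length or of the order of 'c' and 'd'; this is exactly the sense in which the learned grammar generalizes beyond the training string.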

Publication date: 2002